Targeted Learning in Data Science by Mark J. van der Laan & Sherri Rose

Author: Mark J. van der Laan & Sherri Rose
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham


16.4.3 Outcome Regressions, $Q_0^\theta$

The conditional expectations $Q_0^\theta$ required to implement the TMLE are unknown and thus need to be estimated first. Below, we describe the two data-adaptive estimators that were considered as initial estimators of the nuisance parameters . For a given $k = t_0, \ldots, 0$, these estimators are derived using only data from patients who did not fail before time $k$ and who followed rule $d_\theta$ through $k$ (i.e., and ). The resulting TMLE are referred to as stratified TMLE, in contrast to pooled TMLE, in which the initial estimators of $Q_0^\theta$ are derived by pooling data from all patients who did not fail before time $k$ (whether or not they followed the dynamic intervention previously) before evaluating these initial estimators at . Note that in studies with small sample sizes, a stratified approach will often not be practical for proper initial estimation of the nuisance parameters $Q_0^\theta$, and extrapolation using data from patients who did not experience the relevant treatment history can then improve TMLE performance.
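
The difference between the stratified and pooled approaches comes down to which patients contribute to the fit at time k. The following is a minimal sketch of that subsetting step only, assuming each patient has already been summarized by two flags at time k. The record type and field names (`PatientRecord`, `failed_before_k`, `followed_rule_through_k`) are hypothetical and do not come from the study data.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # All field names are hypothetical, chosen for illustration only.
    patient_id: int
    failed_before_k: bool          # outcome event occurred before time k
    followed_rule_through_k: bool  # observed treatment agrees with the rule through k


def stratified_sample(records):
    """Patients still at risk at time k who also followed the rule through k."""
    return [r for r in records
            if not r.failed_before_k and r.followed_rule_through_k]


def pooled_sample(records):
    """All patients still at risk at time k, whether or not they followed the rule."""
    return [r for r in records if not r.failed_before_k]


records = [
    PatientRecord(1, failed_before_k=False, followed_rule_through_k=True),
    PatientRecord(2, failed_before_k=False, followed_rule_through_k=False),
    PatientRecord(3, failed_before_k=True, followed_rule_through_k=True),
    PatientRecord(4, failed_before_k=False, followed_rule_through_k=True),
]

stratified_ids = [r.patient_id for r in stratified_sample(records)]
pooled_ids = [r.patient_id for r in pooled_sample(records)]
print(stratified_ids, pooled_ids)
```

The pooled sample is always a superset of the stratified one, which is why pooling can help when few patients followed the rule, at the cost of extrapolating from patients with a different treatment history.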

DSA. The Deletion/Substitution/Addition (DSA) algorithm (Sinisi and van der Laan 2004; Neugebauer and Bullard 2010) implements a data-adaptive estimator selection procedure based on cross-validation. It can be used as a machine learning approach for estimating conditional expectations. Here, the DSA was used as an initial estimator of $Q_0^\theta$ based on candidate estimators that were restricted to main-term logistic models with the following candidate explanatory variables: all time-independent covariates, the last measurement of time-varying covariates, and the latest change in A1c. To decrease computing time, the DSA was implemented with a single 5-fold cross-validation split, without deletion and substitution moves, and with a maximum model size (i.e., number of main terms in the logistic models) equal to 10. The resulting estimator of $Q_0^\theta$ is denoted by $Q_{n,\mathrm{DSA}}^\theta$.
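
With deletion and substitution moves switched off, the search reduces to adding one main term at a time, with cross-validation choosing how many terms to keep. The sketch below illustrates that idea under several assumptions: it is not the DSA software, the function names are hypothetical, the logistic models are fit by plain gradient descent, and the log-likelihood loss is used for both the search and the cross-validated risk.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, y, cols, steps=100, lr=0.5):
    """Intercept plus main terms in `cols`, fit by gradient descent (illustrative)."""
    beta = [0.0] * (len(cols) + 1)  # beta[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for row, yi in zip(rows, y):
            feats = [1.0] + [row[c] for c in cols]
            err = sigmoid(sum(b * f for b, f in zip(beta, feats))) - yi
            for j, f in enumerate(feats):
                grad[j] += err * f / n
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


def log_loss(rows, y, cols, beta):
    total = 0.0
    for row, yi in zip(rows, y):
        p = sigmoid(beta[0] + sum(b * row[c] for b, c in zip(beta[1:], cols)))
        p = min(max(p, 1e-9), 1 - 1e-9)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(rows)


def forward_path(rows, y, n_vars, max_size):
    """Addition moves only: returns the best model found at each size 1..max_size."""
    path, chosen = [], []
    while len(chosen) < min(max_size, n_vars):
        def train_risk(c):
            cols = chosen + [c]
            return log_loss(rows, y, cols, fit_logistic(rows, y, cols))
        best = min((c for c in range(n_vars) if c not in chosen), key=train_risk)
        chosen = chosen + [best]
        path.append(chosen)
    return path


def addition_only_dsa(rows, y, max_size=10, n_folds=5, seed=0):
    """Pick the model size by a single V-fold split, then refit on all data."""
    n_vars = len(rows[0])
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    sizes = min(max_size, n_vars)
    cv_risk = [0.0] * sizes
    for f in range(n_folds):
        fold = idx[f::n_folds]
        held = set(fold)
        tr = [i for i in idx if i not in held]
        tr_rows, tr_y = [rows[i] for i in tr], [y[i] for i in tr]
        va_rows, va_y = [rows[i] for i in fold], [y[i] for i in fold]
        for s, cols in enumerate(forward_path(tr_rows, tr_y, n_vars, max_size)):
            beta = fit_logistic(tr_rows, tr_y, cols)
            cv_risk[s] += log_loss(va_rows, va_y, cols, beta) / n_folds
    best_size = 1 + min(range(sizes), key=cv_risk.__getitem__)
    cols = forward_path(rows, y, n_vars, best_size)[-1]
    return cols, fit_logistic(rows, y, cols)


# Simulated data: only the first of three covariates affects the outcome.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(120)]
y = [1 if rng.random() < sigmoid(2.0 * r[0]) else 0 for r in rows]
selected, coefs = addition_only_dsa(rows, y, max_size=2)
print(selected, coefs)
```

The `max_size` argument plays the role of the maximum model size of 10 mentioned above; the toy call uses 2 only to keep the pure-Python fit fast.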

Super Learner. Based on the same rationale that motivated the use of SL to estimate the action mechanism, SL was also considered here to define the initial estimator of the nuisance parameter $Q_0^\theta$. The details of the SL approach that was implemented are described in Neugebauer et al. (2014a). In short, for each time point $k$ a separate super learner was constructed based on the following eight classes of candidate learners (each learner used a different subset of explanatory variables): (a) seven learners defined by main-term logistic models; (b) five learners defined by a stepwise model selection using AIC; (c) five learners defined by neural networks; (d) five learners defined by Bayes regressions; (e) five learners defined by polychotomous regressions; (f) five learners defined by Random Forests; (g) five learners defined by bagging for classification trees; (h) 20 learners defined by generalized additive models with smoothing splines. The set of explanatory variables considered included the variables used in the DSA approach but was also expanded to include two-way interaction terms between these variables and additional summary measures of past covariates (e.g., the average of all past A1c measurements or the number of past A1c measurements above 8%).
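
The general super learner recipe can be sketched in a few lines: obtain cross-validated predictions from each candidate learner, find the convex combination of candidates with the smallest cross-validated risk, then refit every candidate on all data and combine them with those weights. The example below is a toy illustration under stated assumptions, not the implementation of Neugebauer et al. (2014a): the three candidate learners are simple stand-ins rather than the eight classes listed above, the loss is squared error, and the weights are found by a coarse grid search over the simplex. All function names are hypothetical.

```python
import itertools
import random


def mean_learner(train_x, train_y):
    """Candidate that ignores covariates and predicts the overall outcome mean."""
    m = sum(train_y) / len(train_y)
    return lambda x: m


def stratum_learner(col):
    """Candidate that predicts the outcome mean within levels of one binary covariate."""
    def fit(train_x, train_y):
        overall = sum(train_y) / len(train_y)
        means = {}
        for level in (0, 1):
            ys = [yi for xi, yi in zip(train_x, train_y) if xi[col] == level]
            means[level] = sum(ys) / len(ys) if ys else overall
        return lambda x: means[x[col]]
    return fit


def cv_predictions(learners, X, y, n_folds=5, seed=0):
    """Each row gets predictions from fits that did not see that row."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    Z = [[0.0] * len(learners) for _ in X]
    for f in range(n_folds):
        held = set(idx[f::n_folds])
        tr = [i for i in idx if i not in held]
        for j, fit in enumerate(learners):
            predict = fit([X[i] for i in tr], [y[i] for i in tr])
            for i in held:
                Z[i][j] = predict(X[i])
    return Z


def convex_weights(Z, y, step=0.1):
    """Grid search over the simplex for weights minimizing cross-validated squared error."""
    ticks = round(1 / step)
    best, best_risk = None, float("inf")
    for combo in itertools.product(range(ticks + 1), repeat=len(Z[0])):
        if sum(combo) != ticks:
            continue  # keep only weight vectors that sum to one
        w = [c / ticks for c in combo]
        risk = sum((yi - sum(wj * zj for wj, zj in zip(w, zi))) ** 2
                   for zi, yi in zip(Z, y)) / len(y)
        if risk < best_risk:
            best, best_risk = w, risk
    return best, best_risk


def super_learner(learners, X, y):
    Z = cv_predictions(learners, X, y)
    weights, _ = convex_weights(Z, y)
    fits = [fit(X, y) for fit in learners]  # refit every candidate on all data
    return weights, lambda x: sum(w * f(x) for w, f in zip(weights, fits))


# Simulated data: the outcome depends on the first binary covariate only.
rng = random.Random(2)
X = [[rng.randint(0, 1), rng.randint(0, 1)] for _ in range(300)]
y = [1 if rng.random() < (0.8 if x[0] else 0.2) else 0 for x in X]
learners = [mean_learner, stratum_learner(0), stratum_learner(1)]
weights, predict = super_learner(learners, X, y)
print(weights, predict([1, 0]), predict([0, 0]))
```

In this simulation most of the weight should go to the candidate that stratifies on the informative covariate. A grid search only scales to a handful of candidates; a library of the size described above would need a proper constrained optimizer for the weights.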


